# Full text of "A comparison of single and simultaneous equation techniques"

```Historic, Archive Document

Do not assume content reflects current
scientific knowledge, policies, or practices.

UNITED STATES DEPARTMENT OF AGRICULTURE

AGRICULTURAL MARKETING SERVICE

A COMPARISON OF SINGLE AND SIMULTANEOUS EQUATION TECHNIQUES*

by

Richard J. Foote

In my opinion, much needless confusion exists in the minds of
economists and statisticians when they think about least squares versus
simultaneous equation techniques. Some analysts believe that the method
of least squares now is completely outmoded; others feel that simultane-
ous equation methods are so complex and computationally expensive that
they should be avoided whenever possible. Each of these viewpoints is
wrong. Simultaneous equation techniques are a useful addition to our kit
of tools for use in problems that deal with regression analysis. When
they are needed, they should be used, just as a hacksaw is used to cut
metal, whereas various woodsaws can be used to cut wood. In systems of
equations, one, several, or perhaps all frequently can be fitted by least
squares. Moreover, least squares equations are useful now, just as they
always have been, in showing normal or average relationships that exist
between sets of variables. Many problems that relate to regression
analysis, such as choice of variables, location of data, choice of
functional forms, and the testing and interpretation of results, are almost
identical regardless of whether the equations are to be fitted by least
squares or by simultaneous equation techniques.

Because of the confusion that exists, I propose to start with some
extremely elementary concepts with which I am sure you are all familiar.
We shall then proceed step by step in such a way as to show precisely
when and why simultaneous equation techniques are needed. In later
sections, I shall discuss some computational aspects of regression analysis
and some considerations relating to the degree of complexity that may be
desirable in formulating a system of economic relationships.

Some Economic Considerations

In 1927, Elmer Working gave an excellent discussion of what now
is called the identification problem in his classic paper "What Do
Statistical 'Demand Curves' Show?" 1/ He pointed out that when a research
worker begins a demand study, he is confronted with a set of dots like
that shown in section A of the chart on p. 3. He knows that each can
be thought of as the intersection of a demand and a supply curve, as in
section B, but, without further information, neither curve can be deter-
mined from the data. Working then noted that if the demand curve has

* Prepared for presentation before the American Farm Economic
Association, August 1, 1955. The author wishes to thank Glen Burrows of the
Agricultural Marketing Service for a number of helpful suggestions on
the statistical aspects of this paper.

1/ Quart. Jour. Econ. 41:212-235, 1927. A similar line of reasoning
is followed by Koopmans, Tjalling C. Identification Problems in Economic
Model Construction, pp. 27-35. This is given as Chap. II of Studies in
Econometric Method. Cowles Commission for Research in Economics Monogr.
14, 1953.

Agriculture-Washington

July 1955

shifted over time but the supply curve has remained relatively stable,
as in section C, the dots trace out a supply curve; conversely, if the
supply curve has shifted but the demand curve has remained stable, as in
section D, the dots trace out a demand curve. If correlated shifts for
each curve have taken place, as in section E, the dots trace out what may
look like a structural demand or supply curve, but the slope will be too
flat or too steep.

In many analyses of the demand for agricultural products, factors
that cause the demand curve to shift over time are included as separate
variables in a multiple regression equation. In effect, we are then able
to derive from our estimating equation an average demand curve. This is
indicated in a rough way in section F. In some analyses, we can assume
that the quantity supplied is essentially unaffected by current price.
When price is plotted on the vertical scale, the supply curve in such
cases is a vertical line, and year-to-year shifts in the supply curve
trace out a demand curve, just as they did in section D. Under these
circumstances we may be able to obtain valid estimates of the elasticity
of demand by use of a least squares multiple regression analysis for which
price is the dependent variable and supply and some demand shifters are
used as independent variables. This point was noted by Working in his
1927 paper, 2/ emphasized by Ezekiel in a paper published in 1928, 3/ and
reconsidered in 1953 by Fox 4/ in the light of modern simultaneous
equations theory. For many agricultural products, this set of circumstances
permits us to estimate elasticities of demand with respect to price by
use of single equation methods. Two points, however, should be kept in
mind: (1) Price must be used as the dependent variable in order to obtain
elasticity estimates that are statistically consistent since, to use the
least squares technique, the supply curve must be a vertical line; and
(2) an algebraic transformation must be made after the equation has been
fitted to derive the appropriate coefficient of elasticity, since the
definition is in terms of the percentage change in quantity associated
with a given percentage change in price. Other circumstances under which
least squares equations can be used to derive coefficients of elasticity
are discussed in a later section.
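
As a sketch of the algebraic transformation mentioned in point (2),
suppose the fitted equation is linear in the logarithms of price p,
quantity q, and a single demand shifter y. The symbols a, b, c, and
\eta are illustrative and do not appear in the original, and the
reciprocal relation assumes that the other variables in the equation
are held constant.

\[ \log p = a + b \log q + c \log y \]
\[ \eta = \frac{\partial \log q}{\partial \log p} = \frac{1}{b} \]

If the equation is fitted instead in actual values, p = a + bq + cy,
the corresponding elasticity evaluated at the means would be
\eta = (1/b)(\bar{p}/\bar{q}). With an invented coefficient b = -2.0
in the logarithmic form, for instance, the implied elasticity of
demand is -0.5.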

What happens if we have a supply curve that is not a vertical line?
If we consider any single point, as in section G, we have no way of
knowing on which demand and supply curve of a whole family of curves it lies.
The basic problem of indeterminateness is similar to that in which
correlated shifts in the demand and supply curves take place. What is needed
is some hypothesis, adequately tested and proven to be sound, as to the
nature of the joint relationships between supply and demand. We should
then be able to untangle the two and to obtain a reliable estimate of the
slope of each curve. This is essentially what is involved in the
simultaneous equations approach. This concept was set forth by Haavelmo in
1943. 5/ Staff members of the Cowles Commission spent a considerable part
of the next 10 years in showing how to implement it when working with
actual data.

2/ Op. cit., p. 223.

3/ Ezekiel, Mordecai. Statistical Analyses and the "Laws" of Price.
Quart. Jour. Econ. 42:199-225.

4/ Fox, Karl A. The Analysis of Demand for Farm Products. U. S. Dept.
Agr. Tech. Bul. 1081.

5/ Haavelmo, Trygve. The Statistical Implications of a System of
Simultaneous Equations. Econometrica 11:1-12, 1943. A more complete
discussion by the same author is given in The Probability Approach to
Econometrics. Econometrica, Vol. 12, Supplement, 1944.

[Chart: SUPPLY-DEMAND RELATIONSHIPS, sections A through H. Section H
shows frequency distributions of estimates of the coefficient obtained
by the limited information method and by least squares, with the true
value marked; the horizontal axis (estimate of the coefficient) is
marked from .46 to .54. U. S. Department of Agriculture, Neg. 1695-55 (6),
Agricultural Marketing Service.]

Suppose, however, that the analyst has no interest in the true
demand and supply curves but only wants a method that will assist him in
studying probable future trends in prices. Working had some suggestions
on this point, too. He said, "It does not follow from the foregoing
analysis that, when conditions are such that shifts of the supply and de-
mand curves are correlated, an attempt to construct a demand curve will
give a result that will be useless. Even though shifts of the supply and
demand curves are correlated, a curve which is fitted to the points of
intersection will be useful for purposes of price forecasting, provided
no new factors are introduced which did not affect the price during the
period of study. Thus, so long as the shifts of the supply and demand
curves remain correlated in the same way, and so long as they shift
through approximately the same range, the curve of regression of price
upon quantity can be used as a means of estimating price from quantity." 6/

The problem here is that the shifts almost never "remain correlated
in the same way" over a sufficiently long period to generate enough data
to fit our equation. In some circumstances, changes in structure are so
frequent that multiple regression equations almost always yield low cor-
relations and frequently even "wrong" signs on the coefficients. This is
particularly apparent when we attempt, by the single equation approach,
to study factors that affect volume of exports. In other cases, changes
in structure are of minor importance and least squares equations may
yield completely satisfactory results in terms of showing relationships
that have prevailed between simultaneously-determined economic variables
over a considerable period of time. This is frequently true when we
study relationships between prices at specified locations or at local
market, wholesale, or retail levels. Even here, however, the analyst
should examine his results closely, perhaps by plotting the data in
scatter diagrams, to determine whether changes in structure have affected
the relationship. The in-between case is the one that can be dangerous.
Here the coefficients may suggest that the analysis is satisfactory; it
may, in fact, be of little value for the study of future trends.

Marschak 7/ gives an interesting example of the importance of
changes in structure on the need for using a complete system of equations.
He considers the old problem of taxation of a monopoly. He points out,
"Knowledge is useful if it helps to make the best decisions." He
considers, among other things, the kinds of knowledge that are useful to
guide the firm in its choice of the most profitable output level. If
the tax rate has not changed in the past and is not expected to change,
the firm can fit an empirical curve to observed data on output and profits
and immediately derive the point of maximum revenue. If the tax rate has
not changed in the past but is expected to change in the future, the firm
could, if it so desired, vary its output and profits under the new tax
structure and derive a new empirical relation. But this takes time, and
substantial losses might occur during the experimental period. If the
firm had taken the trouble to derive the structural demand and net revenue
6/ Working, op. cit., p. 227.

7/ Marschak, Jacob. Economic Measurements for Policy and Prediction,
Chap. I in Studies in Econometric Method. Cowles Commission for Research
in Economics Monogr. 14, 1953.

curves, it could immediately have determined its most profitable output
under the new tax structure. If the tax rate had varied during the
initial period, an empirical regression of net revenue on output and the tax
rate could have been fitted and used to find the most profitable output
under the new tax structure. In many real life situations, changes in
structure are frequent. Hence Marschak concludes: "A theory may appear
unnecessary for policy decisions (or forecasting) until a certain
structural change is expected or intended. It becomes necessary then. Since
it is difficult to specify in advance what structural changes may be
visualized later, it is almost certain that a broad analysis of economic
structure, later to be filled out in detail according to needs, is not
a wasted effort."

This argument in no way invalidates the use of a single equation
to estimate elasticities of demand in those cases in which the supply can
be considered as unaffected by current price. In such cases, we may
obtain estimates of the structural parameters that can be used in the
same way as any other statistically valid estimates. Instead, Marschak
is arguing that only rarely should the economic analyst be satisfied
with a purely empirical fit if he can obtain structural relationships.

Some Statistical Considerations

We now turn to some statistical considerations that have a bearing
on the extent to which we can use least squares to estimate the coeffi-
cients in a given equation. In the relationships discussed in the pre-
ceding section, we have assumed, more or less implicitly, that the points
lie exactly on the demand or supply curve. In actual statistical analyses
this is never true, since some variables that cause the curves to shift
always are omitted and the precise shapes of the curves to be fitted are
not known. Thus we assume that we are dealing with stochastic rather than
functional relations. A stochastic relation basically is one that in-
cludes a set of unexplained residuals or error terms whose direction and
magnitude are usually not known exactly for any particular set of calcu-
lations, but whose behavior on the average over repeated samples can be
described or assumed.

In order to have a concrete example about which to talk, let us
consider the following equation:

\[ Y = a + b_1 Z_1 + b_2 Z_2 + u \qquad (1) \]

Here Y is the variable for which an estimate is desired, the Z's are 2
variables which are known to affect Y, and u is an error term. We assume
that for a number of periods we know the value of Y and the Z's and we
wish to estimate $a$, $b_1$, and $b_2$. We do not know the value of u but can
estimate it in a rough way for any given period as the difference between
the value of Y computed from the equation and its actual value.

We know that estimates of the regression coefficients will differ
for different sets of observations. However, we would like to estimate
them in such a way that the average value for a large number of periods or
samples equals the value that would be obtained from a similar calculation
based on the combined evidence of all possible samples. Estimates of this
sort are known statistically as unbiased estimates. We also would like the
variation of the estimates about their average or true value to be as small
as possible, since under this circumstance we would have more confidence
in any single estimate than if we had a large amount of variation.
Estimating procedures that give the smallest possible variance are known as
best estimates. Despite their name, such estimates possess no more
desirable properties than many alternative estimates. So the choice of
terminology is unfortunate, but it has become firmly established. In certain
circumstances, we may be unable to obtain best unbiased estimates but may
be able to obtain estimates that are consistent and efficient. A consistent
estimate is one that is unbiased when we work with all the possible data;
it may or may not be biased in small samples. In actual practice, of
course, we never have all possible data, but estimation procedures that
give consistent estimates presumably are better even with small samples
than are those that are known to be biased even with an infinitely large
sample. Efficient estimates are similar to "best" estimates, except that
they are known to give the smallest possible variance only when we work
with all possible data.
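
These properties can be stated compactly in symbols. What follows is
only a sketch in the usual textbook form: the notation \hat{b}_n for an
estimate of a true coefficient b from n observations, and \tilde{b}_n
for any competing unbiased estimate, is not in the original, and "best"
is read here in the sense of "best unbiased."

\[ \text{unbiased:} \quad E(\hat{b}_n) = b \ \text{ for every } n \]
\[ \text{best:} \quad \operatorname{Var}(\hat{b}_n) \le \operatorname{Var}(\tilde{b}_n) \]
\[ \text{consistent:} \quad \operatorname{plim}_{n \to \infty} \hat{b}_n = b \]

An efficient estimate, in the sense used above, is one for which the
variance comparison is claimed only as n becomes very large.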

If we use the method of least squares to estimate the coefficients
in equation (1), we obtain best unbiased estimates if the u's and Z's meet
certain rather rigid specifications. Some of the specifications that
relate to the u's are difficult to state precisely in nonmathematical terms,
but essentially they require that the u's follow some (not necessarily a
normal) probability distribution, that their average or expected value be
zero, that their variance be finite and independent of the Z's (the latter
is the property of homoscedasticity, for those of you who remember your
elementary statistics), and, finally, they must be serially independent.
When working with economic data, we usually assume that these assumptions
hold; but we may test at least the one regarding serial independence of
the residuals after we have run the analysis. In some cases, we transform
the data to (1) logarithms, which frequently helps to render the variance
of u less dependent upon the Z's, or (2) first differences, which when
working with economic data frequently tends to reduce the serial correlation
in the u's. As an alternative we may, of course, use first differences
of logarithms.
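
Written out for the variables of equation (1), the three transformations
might look as follows. This is a sketch only: the time subscript t and
the difference operator \Delta are added for illustration, and the
coefficients and error terms in each form are not the same quantities
as those in equation (1).

\[ \log Y_t = a + b_1 \log Z_{1t} + b_2 \log Z_{2t} + u_t \]
\[ \Delta Y_t = a + b_1 \Delta Z_{1t} + b_2 \Delta Z_{2t} + u_t, \qquad \Delta Y_t = Y_t - Y_{t-1} \]
\[ \Delta \log Y_t = a + b_1 \Delta \log Z_{1t} + b_2 \Delta \log Z_{2t} + u_t \]

In the logarithmic forms, each b measures the percentage change in Y
associated with a 1 percent change in the corresponding Z.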

In addition there is a specification regarding the Z's that is easily
stated but frequently disregarded by economists. To be certain that the
least squares approach will give best unbiased estimates, each Z must be
a set of known numbers, in contrast to a random variable. When attempting
to obtain elasticities of demand, this is true only in rare instances. The
only case of which I can think is the one for which prices are arbitrarily
set at certain levels, as in a retail store experiment, and the quantities
bought by consumers at these prices are recorded. I know of only one
experiment that has been conducted in this way, that for oranges by Godwin. 8/
The least squares method was developed for use in connection with experiments

8/ Godwin, Marshall R. Customer Response to Varying Prices for Florida
Oranges. Fla. Agr. Expt. Sta. Bul. 508. 1952.

in the physical sciences, where the independent variables frequently are
sets of known numbers, or to study relationships between variables, such
as heights of fathers and heights of sons, where no "structural"
coefficients are involved.

Econometricians have shown that the least squares approach will give
estimates of the regression coefficients that are statistically consistent
and efficient, provided the u's meet approximately the same requirements
as for the previous case, if the Z's are predetermined variables. A
simplified proof of this is given by Klein, 9/ although a fairly advanced
knowledge of calculus and of the principles of probability given in
chapter 2 of his book is required to follow his development. Just as the
term "best" as used by statisticians is an unfortunate one, the term
"predetermined" as used by econometricians is unfortunate. But as it now
is commonly used in the literature, it seems advisable to learn what it
means so that we can continue to use it. Predetermined variables include
those that the analyst takes as given, while endogenous variables are those
that are determined simultaneously by the same set of economic forces and,
therefore, are to be estimated from the model. The predetermined variables
commonly are divided into exogenous variables and lagged values of endo-
genous variables. Lagged values of the endogenous variables, naturally,
are values of these variables for a previous time period. Exogenous
variables include all other variables that might enter into an analysis.
Weather is a commonly cited exogenous variable. For many variables, the
exact economic structure of the segment of the economy being studied must
be carefully considered to determine whether they should be classified as
endogenous or exogenous, and frequently different analysts will classify
them in different ways.

We now can reconsider an example cited earlier. In section F we
showed a diagrammatic representation of a situation for which the least
squares method could be used to estimate the slope of the demand curve.
We now know that this estimate will be statistically consistent and
efficient only if the quantity consumed and the demand shifters each can be
classified as an exogenous or lagged endogenous variable. Fox 10/ has
argued that this is approximately true for a considerable number of
agricultural products, including meat, poultry and eggs, feed grains, and a
number of fresh fruits and vegetables. Under the assumptions of Fox,
market price is used as the dependent variable and the independent vari-
ables are supply (which is assumed to be highly correlated with consumption)
and some relevant demand shifters, including usually disposable income.
Another situation under which we can use the method of least squares to
estimate elasticities of demand is that for which data are available on
purchases or consumption of individual consumers, as prices that confront
consumers are determined chiefly by factors other than those that affect
their purchases. In this case, consumption is taken as the dependent
variable and retail prices of the various items, family income, and per-
haps other household characteristics, are taken as independent variables.

9/ Klein, Lawrence R. A Textbook of Econometrics. Row, Peterson and
Co., 1953, pp. 80-85.

10/ Op. cit.

One further problem needs to be considered before leaving the
statistical aspects of this subject. In all cases given previously, we
have assumed implicitly that the variables are known without error. Any
analyst who has been connected with the compilation of data from original
sources knows that errors of one sort or another always creep in. These
can result from memory bias on the part of respondents, inability to find
all of the people in a complete census, errors of sampling, and a host
of other reasons beyond the control of the most careful investigator.
Whenever we work with economic series, nonnegligible errors in the data
are known to exist.

Fox has suggested an easy way to correct for this, provided we are
willing to make an informed guess about the average magnitude of the
errors. 11/ Such guesses or estimates can be made by carefully reading
how the series was compiled, or preferably, by talking to the person in
charge of the compilation. For most series, we have a fairly good idea
as to whether the error is in terms of, say, a few percent, 10 to 20
percent, or something higher. If we wish, we can experiment with various
assumptions about the level of error, compare the effect on our coef-
ficients, and then arrive at a rather good idea as to the effect which
likely errors may have on their magnitude.

Because of the importance of a correction of this sort, I would
like to show you exactly what is involved in an extremely simple case.

All of you know, I am sure, that when we express variables in terms of
deviations from their respective means, the least squares regression
coefficient between two variables is given by:

\[ b_{yx} = \frac{\sum yx}{\sum x^2} \]

Suppose that y and x are each subject to error and that the magnitude of
the error is given by d and e, respectively. The sample regression
coefficient then equals:

\[ b^{*}_{yx} = \frac{\sum (y+d)(x+e)}{\sum (x+e)^2} \]

Let us now expand the numerator and denominator. We obtain:

\[ \sum (y+d)(x+e) = \sum yx + \sum ye + \sum dx + \sum de \]
\[ \sum (x+e)^2 = \sum x^2 + 2 \sum xe + \sum e^2 \]

11/ This approach and some examples based on it are described in some
detail by Foote, Richard J. and Fox, Karl A. Analytical Tools for
Measuring Demand. U.S. Dept. Agr. Agr. Handb. 64, 1954, pp. 29-35.

If d and e are random and independent, and we have a large sample,
any summation term that involves one or both of them as a cross-product
equals approximately zero. Summation terms that involve their squares,
however, do not equal zero. Applying this principle, when the variables
are subject to error, we can write the value of the regression
coefficient in the following way:

\[ b^{*}_{yx} \approx \frac{\sum yx}{\sum x^2 + \sum e^2} \]

This gives a biased estimate of $b_{yx}$ because the denominator is too large.
But if we reduce the sum of squares for the independent variable by an
amount proportional to the assumed percentage error, then we obtain an
estimate of the regression coefficient that is approximately unbiased
provided the only source of bias is that due to errors in the data. Such
a correction can be made easily. In working with multiple regression
coefficients, some formulas involve sums of squares for the dependent
variable also, so all sums of squares that enter into the analysis should
be corrected in a similar way. I see no reason why the same sort of cor-
rection could not be applied to the sums of squares that are involved
in equations to be fitted by the simultaneous equations approach.
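
The correction can be sketched as follows, assuming that the analyst is
willing to guess the share of the observed sum of squares that is due
to error. The symbols X = x + e and Y = y + d for the series as
observed, k for the assumed share, and b^{*}_{yx} = \sum YX / \sum X^2
for the coefficient computed from the observed data are illustrative
and are not in the original; how a percentage error in a series is
converted into k is left to the analyst's judgment.

\[ k = \frac{\sum e^2}{\sum X^2}, \qquad \sum x^2 \approx (1 - k) \sum X^2 \]
\[ b_{yx} \approx \frac{\sum YX}{(1 - k) \sum X^2} = \frac{b^{*}_{yx}}{1 - k} \]

With an invented k = 0.10, for example, an uncorrected coefficient of
0.45 would be raised to 0.45/0.90 = 0.50. Repeating the calculation for
several values of k shows how sensitive the coefficient is to the
assumed level of error.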

Some Econometric Considerations

Discussion in the preceding section suggests, and econometricians
have shown, that if we use the least squares approach to estimate the
regression coefficients in an equation that contains current values of
2 or more endogenous variables, we obtain estimates that are statisti-
cally biased. The mathematical nature of the bias has been shown by a
number of authors in a supposedly popular way, 12/ but I have yet to
find an explanation that is completely satisfactory for a
nonmathematician. We do, however, have some experimental evidence of the kind of
bias that results when the method of least squares is applied in such
cases. Methods which have been developed to handle equations that
contain 2 or more endogenous variables are known to give estimates that are
statistically consistent, but methods are not now available that are
known to be statistically unbiased. Thus, we cannot say for sure what
happens when we work with the small samples that usually are involved
in economic research.
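
The source of the bias can be sketched for a two-equation demand and
supply model. The error terms u and v, their variances and covariance,
and the assumption that they are uncorrelated with the predetermined
variables y (income) and w (weather) are added here for illustration;
the coefficients are labeled in the same way as in the structural
equations used later in this section.

\[ p = b_{11} q + b_{12} y + u, \qquad q = b_{21} p + b_{22} w + v \]
\[ q = \frac{b_{22} w + b_{12} b_{21} y + b_{21} u + v}{1 - b_{11} b_{21}} \]
\[ \operatorname{Cov}(q, u) = \frac{b_{21} \sigma_u^2 + \sigma_{uv}}{1 - b_{11} b_{21}} \]

Unless this covariance happens to be zero, q is correlated with the
error term of the first equation, so q cannot be treated as a
predetermined variable when p is regressed on q and y by least squares.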

An experiment was designed to measure the kind of bias that arises
when we apply these methods to small samples 13/ and, as a byproduct of
this experiment, we have some concrete evidence of the kind of bias that
may arise when we use the method of least squares instead.

12/ See, for example, Bronfenbrenner, Jean. Sources and Size of Least-
Squares Bias in a Two-Equation Model, Chap. IX in Studies in Econometric
Method. Cowles Commission for Research in Economics Monogr. 14, 1953;
Bennion, E. G. The Cowles Commission's "Simultaneous-Equation Approach":
A Simplified Explanation. Review Econ. and Statis. 34:49-56, 1952; and
Meyer, John R. and Miller, Henry Lawrence Jr. Some Comments on the
"Simultaneous-Equations" Approach. Review Econ. and Statis. 36:88-92, 1954.

13/ Wagner, Harvey M. A Monte Carlo Study of Estimates of Simultaneous
Linear Structural Equations. Tech. Rpt. 12, Dept. of Econ., Stanford
Univ. 1954.

In this experiment, a simple 3-equation model was formulated with
known coefficients. Variables generated by the model were obtained and
random error terms added to them. Two thousand observations were
obtained in this way, and they were then divided into 100 samples of 20
observations each. The first equation contains 2 endogenous variables.
Regression coefficients were obtained for this equation for each sample
by the limited information approach, which I will discuss later, and by
the method of least squares. Since there were 100 samples, 100 separate
estimates of the single regression coefficient involved were obtained by
each approach. Frequency distributions of these estimates are shown in
section H of the chart, together with the true value of the coefficient.
Each method gives estimates that are biased, but the 3 highest frequen-
cies for the limited information approach are grouped about the true
value, whereas the 3 highest frequencies for the least squares approach
each are to the right of the true value. The average bias of the least
squares estimates was almost 3 times as large as that for the limited
information approach. In fairness, I should point out that for a second
version of this model the 2 frequency distributions are more nearly
similar, but the average bias in the least squares estimates still was
almost double that for the limited information estimates. This study
was carried out at Stanford University by making use of a large-scale
electronic computer. It is to be hoped that further experiments of
similar character, with different sorts of models, will be undertaken.

In discussing methods that are used to handle equations that
involve more than 1 endogenous variable, it is convenient to introduce a
mathematical concept that deals with the degree of identification. We
saw earlier that it is sometimes impossible to estimate the coefficients
in certain structural equations with the kind of statistical data
available to do the job. Such equations are said to lack identifiability
or to be underidentified. Identifiable equations, however, may
be just identified or overidentified. In the discussion that follows,
we deal only with identifiable equations. The degree of identification
relates to individual equations in a system, not to the entire system.

By algebraic manipulation, we always can write down the equations
in a complete system so that the number of equations equals the number of
endogenous variables. We then can think of these as n equations in n
unknown endogenous variables, and we can always solve the equations so
that each endogenous variable is expressed as a function of all of the
predetermined variables in the system. These are called reduced form
equations. Since each reduced form equation contains only a single
endogenous variable, we can obtain estimates of the regression coef-
ficients in these equations that are statistically consistent and
efficient by use of the method of least squares.

If each equation in the system is just identified, there is
always a unique transformation by which we can go from the coefficients
in the reduced form equations to the coefficients in the structural
equations. I am sure that many of you have seen this transformation for
simple systems of equations, but perhaps it is worthwhile to run through
an example to make sure that its nature is clear to everyone. Suppose
that we have the following structural demand and supply equations, where
p and q have their usual meaning; y is consumer income, w represents
important weather factors that affect supply, and each variable is
expressed in terms of deviations from its respective mean:

\[ p = b_{11} q + b_{12} y \qquad (2) \]

\[ q = b_{21} p + b_{22} w \qquad (3) \]

We have 4 variables in the system and 3 variables in each equation;
since 4-3 equals the number of endogenous variables in the system minus
one, we can assume that each equation is just identified. Several rules
of thumb like this are available to determine the degree of
identification; more exact rules depend on the rank of certain matrices.

If we substitute the right-hand side of equation (3) for q in
equation (2) and the right-hand side of equation (2) for p in equation
(3) and simplify terms, we obtain the following:

\[ p = \frac{b_{11} b_{22}}{1 - b_{11} b_{21}} w + \frac{b_{12}}{1 - b_{11} b_{21}} y \qquad (4) \]

\[ q = \frac{b_{22}}{1 - b_{11} b_{21}} w + \frac{b_{12} b_{21}}{1 - b_{11} b_{21}} y \qquad (5) \]

Since the denominator of each term on the right of the equality sign is
identical, we can ignore these denominators for the moment. If we
divide the coefficient of w in equation (4) by the coefficient of w in
equation (5), we obtain an estimate of $b_{11}$. If we divide the
coefficient of y in equation (5) by the coefficient of y in equation (4), we
obtain an estimate of $b_{21}$. Given an estimate of $b_{11}$ and $b_{21}$, we can
estimate $b_{12}$ from the coefficient of y in equation (4) and $b_{22}$ from the
coefficient of w in equation (5). This gives the 4 regression
coefficients needed for our structural equations. Estimates that are
uniquely equivalent are obtained by any alternative algebraic
manipulation. Since the coefficients in the reduced form equations are
known to be statistically consistent estimates, the estimates of the
structural coefficients obtained in this way are statistically consistent.
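
The steps just described can be sketched in symbols for the model
p = b_{11} q + b_{12} y, q = b_{21} p + b_{22} w. The \pi symbols for
the reduced form coefficients are introduced here for illustration and
are not in the original, and the numerical values below are invented
only to check the algebra.

\[ p = \pi_{pw} w + \pi_{py} y, \qquad q = \pi_{qw} w + \pi_{qy} y \]
\[ b_{11} = \frac{\pi_{pw}}{\pi_{qw}}, \qquad b_{21} = \frac{\pi_{qy}}{\pi_{py}} \]
\[ b_{12} = \pi_{py} (1 - b_{11} b_{21}), \qquad b_{22} = \pi_{qw} (1 - b_{11} b_{21}) \]

Suppose b_{11} = -0.5, b_{12} = 0.8, b_{21} = 0.4, and b_{22} = 1.0.
The common denominator is 1 - b_{11} b_{21} = 1.2, so that
\pi_{pw} = -0.5/1.2, \pi_{py} = 0.8/1.2, \pi_{qw} = 1.0/1.2, and
\pi_{qy} = 0.32/1.2. The two ratios return -0.5 and 0.4, and
multiplying \pi_{py} and \pi_{qw} by 1.2 returns 0.8 and 1.0.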

Computationally, we may wish to estimate the coefficients in
another way, but the answers obtained are identical to those that would
have been gotten by an algebraic manipulation of the regression coef-
ficients from the reduced form equations.

Let us now consider an equation that is overidentified. Suppose
that our supply equation contains a second predetermined variable, z,
that represents lagged values of prices. We now have 5 variables in the
system. The supply equation still is just identified, since 5-4 equals
the number of endogenous variables in the system minus one. However,
the demand equation is overidentified, since 5-3 is greater than 2-1.
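
The counting rule used in these two examples can be written compactly.
The symbols K (number of variables in the system), k_i (number of
variables in equation i), and G (number of endogenous variables in the
system) are introduced here for illustration. As noted earlier, this is
only a rule of thumb; exact conditions depend on the rank of certain
matrices.

\[ K - k_i = G - 1 \quad \text{(just identified)} \]
\[ K - k_i > G - 1 \quad \text{(overidentified)} \]
\[ K - k_i < G - 1 \quad \text{(underidentified)} \]

In the first example K = 4, k_i = 3, and G = 2 for both equations, so
4 - 3 = 2 - 1. After z is added to the supply equation, K = 5; the
supply equation has k_i = 4, giving 5 - 4 = 1, and the demand equation
has k_i = 3, giving 5 - 3 = 2 > 1.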

The new reduced form equations can he obtained by the same general
approach as used previously^ but the result now looks like this;

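p = \frac{b_{11} b_{22}}{1 - b_{11} b_{21}} w + \frac{b_{11} b_{23}}{1 - b_{11} b_{21}} z + \frac{b_{12}}{1 - b_{11} b_{21}} y    (6)

q = \frac{b_{22}}{1 - b_{11} b_{21}} w + \frac{b_{23}}{1 - b_{11} b_{21}} z + \frac{b_{12} b_{21}}{1 - b_{11} b_{21}} y    (7)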

With this set of equations, b_{11} could be estimated either by dividing
the coefficient of w in equation (6) by the coefficient of w in equation
(7) or by dividing the coefficient of z in equation (6) by the coefficient
of z in equation (7). Different answers are obtained from the 2 estimates.
It is in this way that overidentified equations differ from just
identified ones; for overidentified equations, we have an oversufficiency
of information and no direct way to decide which answer to use. In fact,
neither answer obtained by the use of reduced form equations is
statistically consistent.

It would be possible to solve the 2 structural equations directly
for the several coefficients involved by making use of a maximum
likelihood approach. Maximum likelihood estimates are known to be
statistically consistent and efficient. They are used widely in statis-
tical work because the necessary equations always can be derived by
performing certain mathematical operations that involve the maximization
of the so-called likelihood function. The general approach is the same
as for any maximization process by use of calculus and it is not
difficult. For complex systems of equations, however, the mathematics
involved in solving the resulting equations is generally complex. That
part of Klein referred to in footnote 9 involved the derivation of
maximum likelihood estimates. Methods for obtaining maximum likelihood
estimates based on a simultaneous solution for all of the structural
equations are discussed by Klein and in Cowles Commission Monograph 14
and are called full-information maximum likelihood estimates but, to
quote Klein, the computations involved in general are "formidable."
Hence this method is seldom used.

Another method, developed by staff members of the Cowles
Commission, is called the single-equation limited-information
maximum-likelihood method. In this approach, equations are fitted one
at a time and information regarding variables that appear in each of the
other equations of the system is disregarded. Use is made, however,
of all endogenous and predetermined variables that appear in the
equation and of all other predetermined variables that appear in the
system. This method is known to give estimates of the regression
coefficients that are statistically consistent and as efficient as
any other method that utilizes the same amount of information. A
compromise between this and the full-information method is one called
the limited-information subsystem method, in which selected groups of
equations are fitted as a unit. Most of the systems of simultaneous
equations that have been fitted and that involve overidentified equations
have been based on the single-equation limited information approach.
This is the approach that generally is meant when reference is made to
the use of the limited information method.

For some applications of the limited information approach, I
would like to call your attention to Cowles Commission Monograph 15
by Hildreth and Jarrett 14/ dealing with the livestock economy; Iowa
Research Bulletin 410 by Nordin, Judge, and Wahby 15/ dealing with
livestock products; and USDA Technical Bulletin 1136 by Meinken 16/ on
wheat, to be issued by us this fall. This approach has been used in
several other studies, but each of these devotes considerable attention

Some Computational Considerations

If certain equations in a system of equations involve only a
single endogenous variable, these can be fitted by the least squares
approach. If other equations are just identified, they can be fitted
by the method of reduced forms. If still other equations are over-
identified, they can be fitted by the single-equation limited information
approach. If the full information approach were to be used, all equations
would be fitted simultaneously. But, even if, in essence, each equation
is to be fitted separately, similar computations based on the same data
frequently are involved for each analysis. Hence considerable compu-
tational time may be saved by handling the computations as a unit.
Even if several equations can be fitted directly by least squares and
no other equations are involved, time frequently is saved by carrying
out the computations as a simultaneous operation for the group. Which-
ever approach is used, the answer obtained is the same; here we are
concerned only with computational efficiency.

If time permitted, I would give some concrete examples to
illustrate this point. We have in process a computational handbook in
which several such examples are discussed in detail. 17/ I would like
to tell you just what is included in this handbook. First we describe
a new method for handling ordinary multiple regression analyses. To
quote the preface, "This method is believed to be more efficient than
those now commonly in use and is easier for beginners to understand than
those based essentially on the Doolittle approach, since no back solution
is required." Then we consider methods for handling equations that
involve 2 or more endogenous variables, using essentially the same
procedure for equations that are either just identified or overidentified.
We discuss in detail the exact steps required for equations containing
specified numbers of endogenous and predetermined variables. Then we
consider the previously mentioned examples of complex systems of
equations. In each case, we show how to obtain the coefficients and
their respective standard errors. Finally, we take up tests that
should be made after the analyses are completed, correlation concepts
that apply to systems of equations, and the use of systems of equations
for analytical purposes or forecasting. To quote the preface again,
"This handbook is designed to provide a complete description of the
steps involved in the more common types of problems and to illustrate
them in a way that will be clear to research and clerical workers who
have an acquaintance only with standard methods for handling single-
equation multiple-regression analyses." The handbook should be
available late this fall. Carbon copies of it are already in use in
our central computing unit.

14/ Hildreth, Clifford, and Jarrett, Frank. A Statistical Study of
Livestock Production and Marketing. Cowles Commission for Research in
Economics Monog. 15. 1955.

15/ Nordin, J. A., Judge, George G., and Wahby, Omar. Application of
Econometric Procedures to the Demands for Agricultural Products. Iowa
Agr. Expt. Sta. Research Bul. 410. July 1954.

16/ Meinken, Kenneth W. The Demand and Price Structure for Wheat. U. S.
Dept. Agr. Tech. Bul. 1136. (In press.)

17/ Friedman, Joan, and Foote, Richard J. Computational Procedures for
Handling Systems of Simultaneous Equations.

Formulation of Models

In my opinion, there should be no question in the minds of
research analysts as to whether they should use single-equation or
simultaneous-equation methods for particular equations or groups of
equations, but decisions must be reached regarding the complexity of
the model to be used and the degree of aggregation. Klein has a use-
ful suggestion here in connection with his discussion of sector models.
18/ He suggests that a master model for the entire national economy be
formulated. As I am sure most of you know, he and staff workers at the
University of Michigan have made some progress in formulating models of
this type. Models for individual industries, individual commodities, or
individual regions can then be formulated and fitted as separate entities
in such a way that they can be "grafted" onto the master model. Grafting
of this sort is being carried out at the University of Michigan and else-
where. In our work, for example, we have tended to treat variables that
relate to the national economy, such as disposable income, as though
they were predetermined. With Klein's approach, (1) the computed value
of disposable income for each observation, based on the master model,
could be used as a predetermined variable, or (2) disposable income
could be treated as an endogenous variable and the predetermined
variables on which it depends brought in as predetermined variables in
the system for the sector. Either method yields consistent estimates,
whereas the treatment of disposable income, as such, as a predetermined
variable results in some bias.

The same general approach could be used in narrower fields. For
example, we plan to fit some national aggregate simultaneous equation
models for the feed-livestock economy in which all feeds and all livestock
and livestock products will be aggregated. The model will be similar to
that used by Hildreth and Jarrett but will contain what we believe to be
some important modifications. At the same time we will be fitting models
that relate to individual feeds and to individual types of livestock,
either for the country as a whole or for specified regions. Later, less
aggregative models may be fitted for separate regions, with separate
equations for each major type of livestock, and these regional models
then could be grafted onto the master feed-livestock economy model,
which in turn could be grafted onto a master model for the entire
economy. Thus, over time, a series of studies could be built up, each
of which would be manageable as a research unit, but all of which would
tie together to make a united whole.

Concluding Remarks

I regret that time has not permitted the presentation of con-
crete examples of the formulation and use of simultaneous systems of
equations, but the references which I cited earlier contain some
excellent examples of these. A complete description of any one of them
would have required at least the full time allotted to me today. I hope
that my remarks have cleared up some of the concepts and terminology
used in this area by modern econometricians. In formulating and fitting
complex models, the nonmathematician will do well to seek the advice of
a competent expert in this area. Fortunately, our graduate schools are
turning out an increasing number of such experts. However, it is my
belief that anyone who is willing to give the matter careful thought
can understand the basic concepts involved and can conduct applied
research in such a way that he can use these concepts to advantage in
his work with only the occasional help of a mathematician.

```